Zero-Shot Scene Graph Relation Prediction Through Commonsense Knowledge Integration

Abstract

Relation prediction among entities in images is an important step in scene graph generation (SGG), which further impacts various visual understanding and reasoning tasks. Existing SGG frameworks, however, require heavy training yet are incapable of modeling unseen (i.e., zero-shot) triplets. In this work, we stress that such incapability is due to the lack of commonsense reasoning, i.e., the ability to associate similar entities and infer similar relations based on a general understanding of the world. To fill this gap, we propose CommOnsense-integrAted sCene grapH rElation pRediction (COACHER), a framework to integrate commonsense knowledge for SGG, especially for zero-shot relation prediction. Specifically, we develop novel graph mining pipelines to model the neighborhoods and paths around entities in an external commonsense knowledge graph, and integrate them on top of state-of-the-art SGG frameworks. Extensive quantitative evaluations and qualitative case studies on both original and manipulated datasets from Visual Genome demonstrate the effectiveness of our proposed approach. The code is available at https://github.com/Wayfear/Coacher.
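As a rough illustration of what mining "neighborhoods and paths around entities" in a knowledge graph can mean, the following minimal sketch collects one-hop neighbors and enumerates short paths between two entities. It is not the COACHER pipeline: the toy triples, the ConceptNet-style relation names, and the function names `build_adjacency`, `neighborhood`, and `find_paths` are all assumptions made for this example.

```python
from collections import deque

# Hypothetical toy commonsense graph as (head, relation, tail) triples.
# These triples are invented for illustration, not taken from the paper.
TRIPLES = [
    ("man", "IsA", "person"),
    ("woman", "IsA", "person"),
    ("person", "CapableOf", "ride"),
    ("horse", "UsedFor", "ride"),
    ("bike", "UsedFor", "ride"),
]


def build_adjacency(triples):
    """Index triples as an undirected map: node -> {(relation, neighbor)}."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, set()).add((rel, tail))
        adj.setdefault(tail, set()).add((rel, head))
    return adj


def neighborhood(adj, entity):
    """Return the sorted one-hop neighbors of an entity."""
    return sorted({node for _, node in adj.get(entity, ())})


def find_paths(adj, source, target, max_hops=3):
    """Enumerate simple paths from source to target with at most max_hops edges."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up, do not extend this path
        for _, nxt in sorted(adj.get(path[-1], ())):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                queue.append(path + [nxt])
    return paths


adj = build_adjacency(TRIPLES)
print(neighborhood(adj, "person"))
print(find_paths(adj, "man", "horse"))
```

In this toy graph, "man" and "woman" share the neighbor "person", and "man" reaches "horse" through "person" and "ride". Shared neighbors and connecting paths of this kind are the sort of signal that could relate an unseen triplet to ones seen during training.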

Similar resources

Zero-shot Object Prediction using Semantic Scene Knowledge

This work focuses on the semantic relations between scenes and objects for visual object recognition. Semantic knowledge can be a powerful source of information, especially in scenarios with few or no annotated training samples. These scenarios are referred to as zero-shot or few-shot recognition and often build on visual attributes. Here, instead of relying on various visual attributes, a more d...

Semantic Graph for Zero-Shot Learning

Zero-shot learning aims to classify visual objects without any training data via knowledge transfer between seen and unseen classes. This is typically achieved by exploring a semantic embedding space where the seen and unseen classes can be related. Previous works differ in what embedding space is used and how different classes and a test image can be related. In this paper, we utilize the anno...

Zero-Shot Relation Extraction via Reading Comprehension

We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot. This reduction has several advantages: we can (1) learn relation-extraction models by extending recent neural reading-comprehension techniques, (2) build very large training sets for those models by combining relation-...

Zero-Shot Recognition via Structured Prediction

We develop a novel method for zero-shot learning (ZSL) based on test-time adaptation of similarity functions learned using training data. Existing methods exclusively employ source-domain side information for recognizing unseen classes during test time. We show that for batch-mode applications, accuracy can be significantly improved by adapting these predictors to the observed test-time target-...

From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge

In this paper we propose the construction of linguistic descriptions of images. This is achieved through the extraction of scene description graphs (SDGs) from visual scenes using an automatically constructed knowledge base. SDGs are constructed using both vision and reasoning. Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given i...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86520-7_29